Goto

Collaborating Authors

 hybrid nonlinear domain


Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

Neural Information Processing Systems

Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high dimensional state and action spaces? To this end, we demonstrate that hybrid planning with Tensorflow and RMSProp gradient descent is competitive with mixed integer linear program (MILP) based optimization on piecewise linear planning domains (where we can compute optimal solutions) and substantially outperforms state-of-the-art interior point methods for nonlinear planning domains. Furthermore, we remark that Tensorflow is highly scalable, converging to a strong plan on a large-scale concurrent domain with a total of 576,000 continuous action parameters distributed over a horizon of 96 time steps and 100 parallel instances in only 4 minutes. We provide a number of insights that clarify such strong performance including observations that despite long horizons, RMSProp avoids both the vanishing and exploding gradient problems. Together these results suggest a new frontier for highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the power of recent advances in gradient descent with highly optimized toolkits like Tensorflow.


Reviews: Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

Neural Information Processing Systems

This paper documents an approach to planning in domains with hybrid state and action spaces, using efficient stochastic gradient descent methods, in particular, in this case, as implemented in TensorFlow. Certainly, the idea of optimizing plans or trajectories using gradient methods is not new (lots of literature on shooting methods, using fmincon, etc. exists). And, people have long understood that random restarts for gradient methods in non-linear problems are a way to mitigate problems with local optimal. What this paper brings is (1) the idea of running those random restarts in parallel and (b) using an existing very efficient implementation of SGD. I'm somewhat torn, because it seems like a good idea, but also not very surprising.


Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

Wu, Ga, Say, Buser, Sanner, Scott

Neural Information Processing Systems

Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high dimensional state and action spaces? To this end, we demonstrate that hybrid planning with Tensorflow and RMSProp gradient descent is competitive with mixed integer linear program (MILP) based optimization on piecewise linear planning domains (where we can compute optimal solutions) and substantially outperforms state-of-the-art interior point methods for nonlinear planning domains. Furthermore, we remark that Tensorflow is highly scalable, converging to a strong plan on a large-scale concurrent domain with a total of 576,000 continuous action parameters distributed over a horizon of 96 time steps and 100 parallel instances in only 4 minutes. We provide a number of insights that clarify such strong performance including observations that despite long horizons, RMSProp avoids both the vanishing and exploding gradient problems. Together these results suggest a new frontier for highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the power of recent advances in gradient descent with highly optimized toolkits like Tensorflow.